[receiver/awslambda] Add multi-format S3 log routing #47237

Draft
MichaelKatsoulis wants to merge 14 commits into open-telemetry:main from MichaelKatsoulis:feature/awslambdareceiver-s3-multi-format

Conversation

@MichaelKatsoulis
Contributor

Description

This PR adds support for routing S3 objects to different encoding extensions based on their key prefix within a
single Lambda deployment.
This is useful when a Lambda receives events from S3 buckets that
store multiple log types (e.g. VPC Flow Logs and CloudTrail in the same bucket, or across
multiple buckets with different log types).

We introduce a new encodings field in the S3 receiver config that can be used like this:

extensions:
  awslogs_encoding/vpcflow:
    format: vpcflow
    vpcflow:
      file_format: plain-text
  awslogs_encoding/cloudtrail:
    format: cloudtrail

receivers:
  awslambda:
    s3:
      encodings:
        - name: vpcflow
          encoding: awslogs_encoding/vpcflow     # decode VPC Flow Log fields into structured records
        - name: cloudtrail
          encoding: awslogs_encoding/cloudtrail  # decode CloudTrail JSON events into structured records
          path_pattern: "myorg/*/CloudTrail"     # optional: override default (AWSLogs/*/CloudTrail); omit to use the default
        - name: catchall
          path_pattern: "*"                      # forward anything else as raw bytes

The existing encoding field is unchanged. encoding and encodings are mutually
exclusive.
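As an illustration, the mutual-exclusivity check could be sketched as below. The type and field names (`s3Config`, `encodingConfig`, `validate`) are assumptions mirroring the YAML above, not the PR's actual code:

```go
package main

import (
	"errors"
	"fmt"
)

// s3Config is a hypothetical sketch of the relevant receiver config;
// field names mirror the YAML example, not the PR's implementation.
type s3Config struct {
	Encoding  string           // legacy single-encoding field
	Encodings []encodingConfig // new multi-format routing field
}

type encodingConfig struct {
	Name        string
	Encoding    string
	PathPattern string
}

// validate enforces that encoding and encodings are mutually exclusive.
func (c *s3Config) validate() error {
	if c.Encoding != "" && len(c.Encodings) > 0 {
		return errors.New("encoding and encodings are mutually exclusive")
	}
	return nil
}

func main() {
	cfg := &s3Config{
		Encoding:  "awslogs_encoding/vpcflow",
		Encodings: []encodingConfig{{Name: "cloudtrail", Encoding: "awslogs_encoding/cloudtrail"}},
	}
	fmt.Println(cfg.validate()) // both fields set: validation error
}
```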

Link to tracking issue

Part of #46458

Testing

Unit Testing.
Pending E2E testing.

Documentation

Readme has been updated.

@MichaelKatsoulis
Contributor Author

@axw This is the first implementation of multi-format log support in the awslambda receiver.
It focuses only on logs stored in S3.

One thing I am not sure about is the multiFormatS3LogsHandler, which lives alongside the pre-existing S3LogsHandler.
When we add multi-format support for logs coming from CloudWatch, we could perhaps create a single routedLogsHandler (or something similar) to cover both the S3 and CloudWatch cases.

Comment on lines +33 to +34
patternParts := strings.Split(pattern, "/")
targetParts := strings.Split(target, "/")
Contributor


We'll be doing a lot of pattern matching; should we optimise this?

  • The pattern can be split at receiver construction time, so we only create a slice once per pattern
  • The object key can be split once per S3 event, rather than once per encoding pattern
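A minimal sketch of that optimisation (hypothetical names; `encodingPattern` and `match` are not the PR's API): the pattern is split once at construction time, and callers split the object key once per S3 event and reuse the resulting slice for every pattern comparison:

```go
package main

import (
	"fmt"
	"strings"
)

// encodingPattern holds a pattern pre-split into path segments, so the
// split happens once when the router is built, not on every lookup.
type encodingPattern struct {
	name     string
	segments []string // e.g. ["AWSLogs", "*", "CloudTrail"]
}

func newEncodingPattern(name, pattern string) encodingPattern {
	return encodingPattern{name: name, segments: strings.Split(pattern, "/")}
}

// match compares pre-split pattern segments against pre-split key
// segments as a prefix; "*" matches any single segment.
func (p encodingPattern) match(keyParts []string) bool {
	if len(p.segments) > len(keyParts) {
		return false
	}
	for i, seg := range p.segments {
		if seg != "*" && seg != keyParts[i] {
			return false
		}
	}
	return true
}

func main() {
	pat := newEncodingPattern("cloudtrail", "AWSLogs/*/CloudTrail")
	// Split the object key once per event, then reuse keyParts.
	keyParts := strings.Split("AWSLogs/123456789012/CloudTrail/us-east-1/file.json.gz", "/")
	fmt.Println(pat.match(keyParts)) // prints "true"
}
```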

}
if strings.Contains(p, "*") {
// Glob matching within a segment (e.g. "eni-*", "*_CloudTrail_*").
matched, _ := path.Match(p, targetParts[i])
Contributor


This will also handle "?", character classes, etc. Do we want that?

If yes, we should document it. If not, we should probably create a variation of path.Match that only handles *. I think we'd be better off simplifying.

// matchPrefixWithWildcard("eni-0abc123-all", "eni-*") => true
// matchPrefixWithWildcard("123_CloudTrail_us-east-1", "*_CloudTrail_*") => true
// matchPrefixWithWildcard("any/path", "*") => true
func matchPrefixWithWildcard(target, pattern string) bool {
Contributor


Would we be better off supporting only a single * or an exact match for S3 path parts? It seems unnecessary to support glob matching in the middle of a path part.

That would simplify the matching: no need for path.Match, just an exact match or accept-anything. I think that makes things easier to reason about. We could also optimise the code further by using a trie (prefix tree), but that's probably overkill given how much time will be spent in matching vs. I/O and parsing.

If we did this, then I think we'll end up with two flavours of patterns:

  • For S3 object keys: path matching with either exact path part match or full wildcard part match.
  • For CloudWatch log group and stream names: prefix and/or suffix wildcard only (*_foo_*, but not foo*bar)

WDYT?
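If we went that way, the two flavours could be sketched roughly as below. The function names (`matchS3Segment`, `matchCWName`) are illustrative, not the PR's code:

```go
package main

import (
	"fmt"
	"strings"
)

// matchS3Segment: an S3 path part either matches exactly or is the
// full wildcard "*" -- no glob matching within a segment.
func matchS3Segment(pattern, part string) bool {
	return pattern == "*" || pattern == part
}

// matchCWName: for CloudWatch log group/stream names, only a leading
// and/or trailing "*" is supported (e.g. "*_CloudTrail_*"); a wildcard
// in the middle, like "foo*bar", is treated as a literal and won't match.
func matchCWName(pattern, name string) bool {
	prefixWild := strings.HasPrefix(pattern, "*")
	suffixWild := strings.HasSuffix(pattern, "*")
	core := strings.Trim(pattern, "*")
	switch {
	case prefixWild && suffixWild:
		return strings.Contains(name, core)
	case prefixWild:
		return strings.HasSuffix(name, core)
	case suffixWild:
		return strings.HasPrefix(name, core)
	default:
		return name == pattern
	}
}

func main() {
	fmt.Println(matchS3Segment("*", "123456789012"))                      // true
	fmt.Println(matchCWName("*_CloudTrail_*", "123_CloudTrail_us-east-1")) // true
	fmt.Println(matchCWName("foo*", "foobar"))                            // true
}
```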

Contributor Author


I agree with all the simplification suggestions.
I was initially thinking "why not support that too?", but that seems to be overkill and complicates things.

@MichaelKatsoulis
Contributor Author

@axw after doing these two main things:

  • Pattern split moved to construction time. strings.Split(pattern, "/") now runs once per encoding when the router is built, not on every GetDecoder call.
  • Object key split moved outside the pattern loop. strings.Split(objectKey, "/") now runs once at the top of GetDecoder, then the same targetParts slice is reused for every pattern comparison.

we can see clear performance gains.

Clear difference:

| Approach | ns/op | B/op | allocs/op |
|---|---|---|---|
| Split every call (old) | 1786 | 1584 | 18 |
| Pre-split patterns + split key once (new) | 382 | 384 | 3 |
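The shape of that comparison can be reproduced with a self-contained sketch. `matchParts`, `compareAllocs`, and the pattern set are illustrative, not the PR's code; `testing.Benchmark` lets us measure allocations from an ordinary program:

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// matchParts checks pre-split pattern segments against pre-split key
// segments as a prefix; "*" matches any single segment (illustrative helper).
func matchParts(pattern, key []string) bool {
	if len(pattern) > len(key) {
		return false
	}
	for i, seg := range pattern {
		if seg != "*" && seg != key[i] {
			return false
		}
	}
	return true
}

// compareAllocs benchmarks the old approach (split on every call)
// against the new one (pre-split patterns, split the key once) and
// returns allocations per operation for each.
func compareAllocs() (oldAllocs, newAllocs int64) {
	patterns := []string{"AWSLogs/*/CloudTrail", "AWSLogs/*/vpcflowlogs"}
	key := "AWSLogs/123456789012/CloudTrail/us-east-1/file.json.gz"

	old := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			for _, p := range patterns {
				// Old: re-split both pattern and key on every comparison.
				_ = matchParts(strings.Split(p, "/"), strings.Split(key, "/"))
			}
		}
	})

	// New: split each pattern once, at "construction" time.
	preSplit := make([][]string, len(patterns))
	for i, p := range patterns {
		preSplit[i] = strings.Split(p, "/")
	}
	optimized := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			// Split the key once per event, reuse for every pattern.
			keyParts := strings.Split(key, "/")
			for _, pp := range preSplit {
				_ = matchParts(pp, keyParts)
			}
		}
	})
	return old.AllocsPerOp(), optimized.AllocsPerOp()
}

func main() {
	o, n := compareAllocs()
	fmt.Printf("old: %d allocs/op, new: %d allocs/op\n", o, n)
}
```

The exact numbers will differ from the table above (that benchmark exercises the real router), but the direction of the difference is the same.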
